
Collaborating Authors: Tekirdag Province


Hyperparameter Tuning Through Pessimistic Bilevel Optimization

Ustun, Meltem Apaydin, Xu, Liang, Zeng, Bo, Qian, Xiaoning

arXiv.org Artificial Intelligence

Automated hyperparameter search in machine learning, especially for deep learning models, is typically formulated as a bilevel optimization problem, with hyperparameter values determined by the upper level and model learning achieved by the lower-level problem. Most existing bilevel optimization solutions either assume the uniqueness of the optimal training model given hyperparameters or adopt an optimistic view when the non-uniqueness issue emerges. Model uncertainty may arise when training complex models with limited data, especially when the uniqueness assumption is violated, so the suitability of the optimistic view underlying current bilevel hyperparameter optimization solutions is questionable. In this paper, we propose pessimistic bilevel hyperparameter optimization, which selects outer-level hyperparameters that help the inner-level learned models generalize better by explicitly incorporating the potential uncertainty of the inner-level solution set. To solve the resulting computationally challenging pessimistic bilevel optimization problem, we develop a novel relaxation-based approximation method that derives pessimistic solutions with more robust prediction models. In our empirical studies of automated hyperparameter search for binary linear classifiers, pessimistic solutions demonstrate better prediction performance than their optimistic counterparts when training data are limited or testing data are perturbed, showing the necessity of considering pessimistic solutions alongside existing optimistic ones.
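As a sketch of the distinction the abstract draws (notation here is generic, not taken from the paper): the inner problem returns a solution set S(λ), and the optimistic and pessimistic formulations differ only in how that set is resolved at the outer level.

```latex
S(\lambda) = \arg\min_{w} \; \mathcal{L}_{\mathrm{train}}(w, \lambda)
\qquad \text{(inner problem: model training)}

\text{optimistic:} \quad \min_{\lambda} \; \min_{w \in S(\lambda)} \; \mathcal{L}_{\mathrm{val}}(w)
\qquad\qquad
\text{pessimistic:} \quad \min_{\lambda} \; \max_{w \in S(\lambda)} \; \mathcal{L}_{\mathrm{val}}(w)
```

When S(λ) is a singleton the two formulations coincide; the pessimistic max guards against the worst-performing inner minimizer when it is not.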


Multi-view Disentanglement for Reinforcement Learning with Multiple Cameras

Dunion, Mhairi, Albrecht, Stefano V.

arXiv.org Artificial Intelligence

The performance of image-based Reinforcement Learning (RL) agents can vary depending on the position of the camera used to capture the images. Training on multiple cameras simultaneously, including a first-person egocentric camera, can leverage information from different camera perspectives to improve the performance of RL. However, hardware constraints may limit the availability of multiple cameras in real-world deployment. Additionally, cameras may become damaged in the real world, preventing access to all cameras that were used during training. To overcome these hardware constraints, we propose Multi-View Disentanglement (MVD), which uses multiple cameras to learn a policy that is robust to a reduction in the number of cameras, generalising to any single camera from the training set. Our approach is a self-supervised auxiliary task for RL that learns a disentangled representation from multiple cameras, with a shared representation that is aligned across all cameras to allow generalisation to a single camera, and a private representation that is camera-specific. We show experimentally that an RL agent trained on a single third-person camera is unable to learn an optimal policy in many control tasks, but our approach, benefiting from multiple cameras during training, is able to solve the task using only that same single third-person camera.
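A minimal sketch of the alignment idea described above, assuming (hypothetically) that each camera's embedding is a flat vector whose first `shared_dim` entries form the shared representation and the rest the private one; MVD's actual losses and architecture are not reproduced here.

```python
import numpy as np

def alignment_loss(embeddings, shared_dim):
    # mean pairwise squared distance between the shared parts of each
    # camera's embedding; minimising it aligns the shared representation
    shared = [z[:shared_dim] for z in embeddings]
    total, pairs = 0.0, 0
    for i in range(len(shared)):
        for j in range(i + 1, len(shared)):
            total += float(np.mean((shared[i] - shared[j]) ** 2))
            pairs += 1
    return total / pairs

# two cameras that agree on the shared half but differ on the private half
z_cam1 = np.array([1.0, 2.0, 5.0, 6.0])
z_cam2 = np.array([1.0, 2.0, -3.0, 9.0])
loss_aligned = alignment_loss([z_cam1, z_cam2], shared_dim=2)
```

Because only the shared halves enter the loss, the private halves are free to encode camera-specific detail, which is the disentanglement the abstract describes.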


Bridging the Bosphorus: Advancing Turkish Large Language Models through Strategies for Low-Resource Language Adaptation and Benchmarking

Acikgoz, Emre Can, Erdogan, Mete, Yuret, Deniz

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are becoming crucial across various fields, emphasizing the urgency for high-quality models in underrepresented languages. This study explores the unique challenges faced by low-resource languages, such as data scarcity, model selection, evaluation, and computational limitations, with a special focus on Turkish. We conduct an in-depth analysis to evaluate the impact of training strategies, model choices, and data availability on the performance of LLMs designed for underrepresented languages. Our approach includes two methodologies: (i) adapting existing LLMs originally pretrained in English to understand Turkish, and (ii) developing a model from the ground up using Turkish pretraining data, both supplemented with supervised fine-tuning on a novel Turkish instruction-tuning dataset aimed at enhancing reasoning capabilities. The relative performance of these methods is evaluated through the creation of a new leaderboard for Turkish LLMs, featuring benchmarks that assess different reasoning and knowledge skills. Furthermore, we conducted experiments on data and model scaling, both during pretraining and fine-tuning, simultaneously emphasizing the capacity for knowledge transfer across languages and addressing the challenges of catastrophic forgetting encountered during fine-tuning on a different language. Our goal is to offer a detailed guide for advancing the LLM framework in low-resource linguistic contexts, thereby making natural language processing (NLP) benefits more globally accessible.


Visual-Policy Learning through Multi-Camera View to Single-Camera View Knowledge Distillation for Robot Manipulation Tasks

Acar, Cihan, Binici, Kuluhan, Tekirdağ, Alp, Wu, Yan

arXiv.org Artificial Intelligence

The use of multi-camera views simultaneously has been shown to improve the generalization capabilities and performance of visual policies. However, the hardware cost and design constraints in real-world scenarios can potentially make it challenging to use multiple cameras. In this study, we present a novel approach to enhance the generalization performance of vision-based Reinforcement Learning (RL) algorithms for robotic manipulation tasks. Our proposed method involves utilizing a technique known as knowledge distillation, in which a pre-trained ``teacher'' policy trained with multiple camera viewpoints guides a ``student'' policy in learning from a single camera viewpoint. To enhance the student policy's robustness against camera location perturbations, it is trained using data augmentation and extreme viewpoint changes. As a result, the student policy learns robust visual features that allow it to locate the object of interest accurately and consistently, regardless of the camera viewpoint. The efficacy and efficiency of the proposed method were evaluated both in simulation and real-world environments. The results demonstrate that the single-view visual student policy can successfully learn to grasp and lift a challenging object, which was not possible with a single-view policy alone. Furthermore, the student policy demonstrates zero-shot transfer capability, where it can successfully grasp and lift objects in real-world scenarios for unseen visual configurations.
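The teacher-to-student transfer can be illustrated with a standard policy-distillation objective; the temperature, the KL direction, and the logits below are generic assumptions, not details from the paper.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) over temperature-softened action
    # distributions: zero when the student reproduces the teacher
    t = softmax(np.asarray(teacher_logits) / temperature)
    s = softmax(np.asarray(student_logits) / temperature)
    return float(np.sum(t * (np.log(t) - np.log(s))))

# teacher sees multi-camera input; the single-camera student is trained
# to match its outputs on the same states
match = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
mismatch = distillation_loss([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
```

Minimising this loss over single-view observations transfers the multi-view teacher's behaviour without requiring multiple cameras at deployment.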


Translation Aligned Sentence Embeddings for Turkish Language

Unlu, Eren, Ciftci, Unver

arXiv.org Artificial Intelligence

Due to the limited availability of high-quality datasets for training sentence embeddings in Turkish, we propose a training methodology and regimen to develop a sentence embedding model. The central idea is simple but effective: fine-tune a pretrained encoder-decoder model in two consecutive stages, where the first stage aligns the embedding space with translation pairs. Thanks to this alignment, the prowess of the main model can be better projected onto the target language in a sentence embedding setting, where it can be fine-tuned to high accuracy in a short time with a limited target-language dataset.
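A toy sketch of the stage-one alignment objective, under the assumption (ours, not the paper's) that it pulls each source-language sentence embedding toward the embedding of its translation via cosine similarity.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def translation_alignment_loss(src_embs, tgt_embs):
    # stage-one objective sketch: pull each source-language sentence
    # embedding toward the embedding of its translation
    return float(np.mean([1.0 - cosine(s, t) for s, t in zip(src_embs, tgt_embs)]))

# perfectly aligned pair -> zero loss; orthogonal pair -> loss of 1
aligned = translation_alignment_loss([np.array([1.0, 0.0])], [np.array([1.0, 0.0])])
misaligned = translation_alignment_loss([np.array([1.0, 0.0])], [np.array([0.0, 1.0])])
```

Once the two languages share an embedding space, the second fine-tuning stage can proceed with far less target-language data.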


Entity Embeddings: Perspectives Towards an Omni-Modality Era for Large Language Models

Unlu, Eren, Ciftci, Unver

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are evolving to integrate multiple modalities, such as text, image, and audio into a unified linguistic space. We envision a future direction based on this framework where conceptual entities defined in sequences of text can also be imagined as modalities. Such a formulation has the potential to overcome the cognitive and computational limitations of current models. Several illustrative examples of such potential implicit modalities are given. Along with vast promises of the hypothesized structure, expected challenges are discussed as well.



Transfer Learning for Electricity Price Forecasting

Gunduz, Salih, Ugurlu, Umut, Oksuz, Ilkay

arXiv.org Machine Learning

Electricity price forecasting has been studied in different markets separately, and learning interdependent information between different markets is an understudied field. Recently, deep learning methods have showcased superior performance in predicting electricity prices [1]. In particular, recurrent neural networks have been able to learn sequential information in time-series data sets [2]. Most of the literature on the application of neural networks to electricity price forecasting has relied on single-market data, and the large amounts of data available from different markets have not been utilized. Transfer learning is a major tool to improve performance on image classification problems: networks can be trained on similar problems before being trained on the final problem, to leverage the data to the fullest. In this paper, we utilize the concept of transfer learning for electricity price forecasting by using data from five different markets. Our major novelties are: (1) we investigate different ways to combine data from different electricity markets when training neural networks, and (2) we propose a transfer learning scheme to leverage data from different markets when training recurrent neural networks (RNNs) for the task of price prediction.
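The pretrain-then-fine-tune scheme can be sketched with a toy linear forecaster in place of an RNN; the synthetic price series and the warm-start mechanism below are illustrative assumptions only.

```python
import numpy as np

def train(X, y, w=None, lr=0.1, epochs=300):
    # least-squares linear forecaster fitted by gradient descent;
    # passing w continues from pretrained weights (the transfer step)
    X = np.c_[X, np.ones(len(X))]          # append a bias column
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(epochs):
        w = w - lr * X.T @ (X @ w - y) / len(y)
    return w

# hypothetical noise-free toy series standing in for prices of two markets
rng = np.random.default_rng(0)
true_w = np.array([1.0, -0.5, 0.2])
X_src = rng.normal(size=(200, 3)); y_src = X_src @ true_w + 3.0  # data-rich source
X_tgt = rng.normal(size=(20, 3));  y_tgt = X_tgt @ true_w + 3.0  # scarce target

w_pre = train(X_src, y_src)            # pretrain on the source market
w_ft = train(X_tgt, y_tgt, w=w_pre)    # fine-tune on the target market
w_scratch = train(X_tgt, y_tgt)        # target-only baseline for comparison
```

Fine-tuning from `w_pre` starts the target-market model near a good solution, which is the benefit sought when target-market data are scarce.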


Forecast monsters fed by big data

#artificialintelligence

Think about how much we need cognitive technologies that enable us to make accurate estimates. There is a need for knowledge, i.e. cognitive computing, to save the potato crop from speculators and to predict elections and the weather accurately. Will it hail, when will it hail, how long will it last, how big will the hailstones be, and where will it hail in Istanbul, from Tekirdağ to Kocaeli? Not Nostradamus but big data and cognitive computing technology lets you find the right answers. On weather forecasting: it was not in vain that IBM bought The Weather Company, which has the world's most sensitive, precise and reliable weather data, at the beginning of 2016.


Adaptive Mixtures of Factor Analyzers

Kaya, Heysem, Salah, Albert Ali

arXiv.org Machine Learning

A mixture of factor analyzers is a semi-parametric density estimator that generalizes the well-known mixtures of Gaussians model by allowing each Gaussian in the mixture to be represented in a different lower-dimensional manifold. This paper presents a robust and parsimonious model selection algorithm for training a mixture of factor analyzers, carrying out simultaneous clustering and locally linear, globally nonlinear dimensionality reduction. Permitting a different number of factors per mixture component, the algorithm adapts the model complexity to the data complexity. We compare the proposed algorithm with related automatic model selection algorithms on a number of benchmarks. The results indicate the effectiveness of this fast and robust approach in clustering, manifold learning and class-conditional modeling.
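A minimal sketch of the density a mixture of factor analyzers defines, where each component k has its own loading matrix Λ_k (possibly with a different number of columns, i.e. factors) and diagonal noise Ψ_k; the paper's model-selection algorithm itself is not shown.

```python
import numpy as np

def gaussian_pdf(x, mu, cov):
    # density of a multivariate normal, evaluated directly
    d = len(mu)
    diff = x - mu
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff) / norm)

def mfa_density(x, pis, mus, lambdas, psis):
    # each component k is a factor analyzer with covariance
    # Lambda_k Lambda_k^T + Psi_k; Lambda_k may have a different
    # number of columns (factors) per component
    return sum(pi * gaussian_pdf(x, mu, L @ L.T + np.diag(psi))
               for pi, mu, L, psi in zip(pis, mus, lambdas, psis))

# two 2-D components: one with a single factor, one with two
pis = [0.5, 0.5]
mus = [np.zeros(2), np.array([5.0, 5.0])]
lambdas = [np.array([[1.0], [0.5]]), np.eye(2)]
psis = [np.array([0.1, 0.1]), np.array([0.1, 0.1])]
p_near = mfa_density(np.zeros(2), pis, mus, lambdas, psis)
p_far = mfa_density(np.array([20.0, 20.0]), pis, mus, lambdas, psis)
```

The low-rank-plus-diagonal covariance is what confines each component to its own lower-dimensional manifold while keeping the parameter count small.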